Implementing the PPM data compression scheme

Author

  • Alistair Moffat
Abstract

The “Prediction by Partial Matching” (PPM) data compression algorithm developed by Cleary and Witten is capable of very high compression rates, encoding English text in as little as 2.2 bits/character. Here it is shown that the estimates made by Cleary and Witten of the resources required to implement the scheme can be revised to allow for a tractable and useful implementation. In particular, a variant is described that encodes and decodes at over 4 kbytes/s on a small workstation, and operates within a few hundred kilobytes of data space, but still obtains compression of about 2.4 bits/character on...

Similar resources

Extended Application of Suffix Trees to Data Compression

A practical scheme for maintaining an index for a sliding window in optimal time and space, by use of a suffix tree, is presented. The index supports location of the longest matching substring in time proportional to the length of the match. The total time for build and update operations is proportional to the size of the input. The algorithm, which is simple and straightforward, is presented i...

Unbounded Length Contexts for PPM

The PPM data compression scheme has set the performance standard in lossless compression of text throughout the past decade. PPM is a finite-context statistical modelling technique that can be viewed as blending together several fixed-order context models to predict the next character in the input sequence. This paper gives a brief introduction to PPM, and describes a variant of the algorithm, ca...

A Single Core Hardware Module of a Data Compression Scheme Using Prediction by Partial Matching Technique

Problem statement: Compression is useful because it helps reduce the consumption of expensive resources, such as hard disk space or transmission bandwidth. For effective data compression, the compression algorithm must be able to predict future data accurately in order to build a good probabilistic model for compression. Lossless compression is essential in cases where it is important that the ...

Generic Adaptive Syntax-Directed Compression for Mobile Code

We propose a new scheme for compressing mobile programs. Our proposal is meant as part of a larger infrastructure for code distribution and deployment. In this paper we show how to effectively compress programs at the source level by compressing abstract syntax trees (ASTs), which are equivalent to source code (modulo comments and layout). We compress ASTs by adapting the well-known PPM (predicti...

Text Compression using Recency Rank with Context and Relation to Context Sorting, Block Sorting and PPM*

Recently, a block sorting compression scheme was developed and its relation to statistical schemes was studied, but its performance has not been well analyzed theoretically. Context sorting is a compression scheme based on context similarity; it is regarded as an online version of block sorting and is asymptotically optimal. However, the compression speed is slower and the real performan...

Journal:
  • IEEE Trans. Communications

Volume: 38

Publication date: 1990